Summary. A central problem of Quantitative Finance is that of formulating a 
probabilistic model of the time evolution of asset prices allowing reliable predictions 
on their future volatility. As in several natural phenomena, the predictions of such 
a model must be compared with the data of a single process realization in our 
records. In order to give statistical significance to such a comparison, assumptions 
of stationarity for some quantities extracted from the single historical time series, 
like the distribution of the returns over a given time interval, cannot be avoided. 
Such assumptions entail the risk of masking or misrepresenting non-stationarities 
of the underlying process, and of giving an incorrect account of its correlations. 
Here we overcome this difficulty by showing that five years of daily Euro/US-Dollar 
trading records in the roughly three hours following the New York market opening 
provide a rich enough ensemble of histories. The statistics of this ensemble allow us to 
propose and test an adequate model of the stochastic process driving the exchange 
rate. This turns out to be a non-Markovian, self-similar process with non-stationary 
returns. The empirical ensemble correlators are in agreement with the predictions of 
this model, which is constructed on the basis of the time-inhomogeneous, anomalous 
scaling obeyed by the return distribution. 



1 Introduction 

The analysis of many natural and social phenomena is hindered by the fact 
that one cannot replicate the dynamical evolution of the system under study. 
This may happen, for instance, for earthquakes [1], solar flares [2], large eco- 
systems [3], and financial markets [4]. If with a single time series available 
we try to accommodate the historical data within a stochastic process description, 
we must assume a priori the existence of some statistical quantities 
which remain stable over time [4]. This entitles us to sample their values at 
different stages of the historical evolution, rather than at different instances 
of the process. For example, in the analysis of historical series in Finance it is 
usual to assume the stationarity of the distribution of return fluctuations and 
hence to detect their statistical features through sliding time interval empirical 
sampling. However, the plausible [5, 6, 7, 8, 9] nonstationarity of these fluctu- 
ations at intervals ranging from minutes to months would drastically alter the 
relation between some of the stylized empirical facts detected in this way, and 
the underlying stochastic process. In order to identify the correct model, one 
has to overcome this difficulty. The breaking of time-translation invariance 
possibly signalled by increments non-stationarity would represent a challenge 
in itself, being a genuine manifestation of dynamics out of equilibrium, like 
the aging properties observed in glassy systems [10]. 

In order to detect the possible presence of nonstationarity at certain time- 
scales for the distribution of the increments, one would need to have access 
to many independent realizations of the same process, repeated under simi- 
lar conditions. Quite remarkably, high-frequency financial time-series offer an 
opportunity of this kind, in which it is possible to directly sample an en- 
semble of histories. In Ref. [7] it has been proposed that when considering 
high-frequency EUR/USD exchange rate data as recorded during the first 
three hours of the New York market activity, independent process realiza- 
tions can tentatively be identified in the daily repetitions of the trading. This 
gives the interesting possibility of estimating quantities related to ensemble-, 
rather than time-averages. Here we take advantage of this opportunity by showing that 
a proper analysis of the statistical properties of this ensemble of histories 
naturally leads to the identification and validation of an original stochastic model 
of market evolution. The main idea at the basis of this model is that the 
scaling properties of the return distribution are sufficient to fully character- 
ize the process in the time range within which they hold. The same type of 
model has been recently proposed by some of the present authors to underlie 
more generally the evolution of financial indices also in cases when only single 
realizations are available [5]. In those cases the application of the model is 
less direct, and rests on suitable assumptions about the relation between the 
stationarized empirical information obtainable from the historical series and 
the underlying driving process. 

An interesting feature of the model discussed here and in Refs. [5, 6], is that 
the anomalous scaling of the return PDF enters in its construction on the basis 
of a property of correlated stability which generalizes the stability of Gaussian 
PDF's under independent random variables summation. This correlated sta- 
bility was shown recently to allow the derivation of novel, constructive limit 
theorems for the PDF of sums of many strongly dependent random variables 
obeying anomalous scaling [11]. In this perspective, the model we present offers 
a valid alternative to more standard models of Finance based on Gaussianity 
and independence. At the same time, the probabilistic framework provided 
by our model presents clear formal analogies and parallels with those 
standard models. 



2 An ensemble of histories based on the returns of the 
EUR/USD exchange rate 

To address the above points, given the EUR/USD exchange rate $S(t)$ at time $t$ 
($t$ measured in tens of minutes) after 9.00 am New York time, let us 
define the return in the interval $[t - T, t]$ as $R(t,T) \equiv \ln S(t) - \ln S(t - T)$, 
where $t = 1, 2, \ldots$ and $t \geq T$. By storing the daily repetitions of the returns from 
March 2000 to March 2005, we obtain an ensemble of $M$ = 1,282 realizations 
$\{r^l(t,T)\}_{l=1,2,\ldots,M}$ of the discrete-time stochastic process $R(t,T)$, with 
$t$ ranging over almost three hours after 9.00 am NY time, i.e., $1 \leq t \leq 17$. Below, 
the superscript "e" labels quantities empirically determined on the basis of 
this ensemble. The first key observation is that the empirical second moment 
$m_2^e(t,1) \equiv \sum_{l=1}^{M} [r^l(t,1)]^2 / M$ systematically decreases as a function of $t$ in 
the interval considered (see Fig. 1a). This is a clear indication of return 
non-stationarity of the underlying process at this time scale. In addition, an 
analysis of the nonlinear moments $m_\alpha^e$ of the total return $R(t,t) = \ln S(t) - \ln S(0)$ 
for $t \geq 1$, 

$$m_\alpha^e(t,t) \equiv \frac{1}{M} \sum_{l=1}^{M} |r^l(t,t)|^\alpha \qquad (1)$$

shows that such a nonstationarity is accompanied by an anomalous scaling 
symmetry. Indeed, to a good approximation one finds $m_\alpha^e(t,t) \sim t^{\alpha D}$ in this 
range of $t$, where $D \simeq 0.364\ldots$ is essentially independent of $\alpha$ (Fig. 1b). 
Accordingly, the ensemble histograms for the PDF's of aggregated returns in 
the intervals $[0,t]$, $p^e_{R(t,t)}$, are consistent with the scaling collapse 

$$t^{D}\, p_{R(t,t)}(t^{D} r) = g(r) \qquad (2)$$

reported in Fig. 2. The scaling function $g$ identified by such collapse plot 
is manifestly non-Gaussian. It may also be assumed to be even to a good 
approximation*. 
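The moment analysis above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not the procedure actually used for the EUR/USD data: it assumes the total returns are stored as one list per value of $t$, and estimates $D$ from a log-log least-squares fit of $m_\alpha^e(t,t)$ against $t$. The function names and the synthetic Gaussian test data are hypothetical and introduced here.

```python
import math
import random


def empirical_moment(values, alpha):
    """Ensemble average of |x|**alpha over the realizations in `values`."""
    return sum(abs(x) ** alpha for x in values) / len(values)


def fit_scaling_exponent(total_returns, alpha):
    """Estimate D from m_alpha(t,t) ~ t**(alpha*D) by a log-log least-squares fit.

    total_returns[t - 1] holds the M values r^l(t,t), for t = 1, 2, ...
    """
    xs = [math.log(t) for t in range(1, len(total_returns) + 1)]
    ys = [math.log(empirical_moment(column, alpha)) for column in total_returns]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return slope / alpha


# Synthetic stand-in for the data (assumption): Gaussian total returns whose
# width grows as t**D_TRUE, so that the fit should recover D_TRUE.
D_TRUE = 0.364
rng = random.Random(0)
synthetic = [[rng.gauss(0.0, t ** D_TRUE) for _ in range(2000)] for t in range(1, 18)]
d_from_alpha_1 = fit_scaling_exponent(synthetic, 1.0)
d_from_alpha_2 = fit_scaling_exponent(synthetic, 2.0)
```

If the scaling is simple, the estimates obtained for different values of $\alpha$ should agree within statistical fluctuations, which is the property reported in Fig. 1b.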

To further simplify our formulas below, wherever appropriate we will 
switch to the notations: $R_i \equiv R(i,1)$ and $r_i \equiv r(i,1)$. Similarly $r_i^l \equiv r^l(i,1)$ 
will indicate the $i$-th return on a 10 min-scale in the $l$-th history realization 
of our ensemble. 

An important empirical fact (Fig. 1c) is that the linear correlation between 
returns for non-overlapping intervals 

$$c^e_{lin}(1,n) \equiv \frac{\frac{1}{M}\sum_{l=1}^{M} r_1^l\, r_n^l}{\sqrt{m_2^e(1,1)\, m_2^e(n,1)}} \qquad (3)$$

with $n = 2, 3, \ldots$, is negligible in comparison with the correlation of the absolute 
values of the same returns. At this time scale also correlators of odd powers of 
a return with odd or even powers of another return are negligible. Only even 
powers of the returns are strongly correlated. 

* We have detrended the data by subtracting from $r^l(t,T)$ the average value 
$\sum_{l=1}^{M} r^l(t,T)/M$. Data skewness can be shown to introduce deviations much 
smaller than the statistical error-bars in the analysis of the correlators. 

Fig. 1. Empirical ensemble analysis of the returns. (a) The line is given by 
$\langle\sigma^2\rangle_\rho\,[t^{2D} - (t-1)^{2D}]$, with $\langle\sigma^2\rangle_\rho = \langle r_1^2\rangle_p = 2.3 \cdot 10^{-7}$ and the best-fitted 
$D = 0.358$. (b) Analysis according to the ansatz in Eq. (2). The straight line 
characterizes a simple-scaling behavior with a best-fitted $D = 0.364$. (c) The 
linear correlation vanishes for non-overlapping returns. 

Fig. 2. Non-Gaussian scaling function $g$. Empty [full] symbols are obtained by 
rescaling $p^e_{R(t,t)}$ [$p^e_{R(t,1)}$] according to Eq. (2) [Eq. (8)] for $t = 1, 5, 10, 17$. 

3 Self-similar model process 

The empirical facts listed above already enable us to suggest a very plausi- 
ble model for the stochastic process expected to generate the data. Both in 
physics and in Finance, a well established trend in modeling anomalous scaling 
is that of expressing the scaling functions, like our $g$, as convex combinations 
of Gaussian PDF's with varying widths. This has clear mathematical advan- 
tages, since it is possible to express very general scaling functions with such 
convex combinations. In physics the representation in terms of mixtures of 
Gaussians often reflects the presence of some heterogeneity or polydispersity 
in the problem [12]. In Finance, the use of convex combinations of Gaussians 
to represent return PDF's is naturally suggested by the fact that return time 
series show a variety of more or less long intervals characterized by peculiar 
values of the volatility (volatility clustering). The idea that $p_{R(t,t)}$ can be 
represented as a mixture of Gaussians of varying widths is suggested by the 
same basic motivations which lead to the introduction of stochastic volatility 
models in Finance [13, 14, 15, 16]. In the light of the empirical facts, such 
a representation of the scaling function in the PDF of the aggregated return 
naturally suggests an adequate full modelization of the process generating 
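the successive partial returns. Let us indicate by $\rho(\sigma)$ a normalized, positive 
measure in $]0, +\infty[$ such that we can represent $g$ as: 

$$g(r) = \int_0^{\infty} d\sigma\, \rho(\sigma)\, \frac{\exp\left(-r^2/(2\sigma^2)\right)}{\sqrt{2\pi\sigma^2}} \qquad (4)$$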

A suitable form of $\rho$ can be easily identified, e.g. by matching its moments 
with those of $g$, and by relating the large $\sigma$ behavior of $\rho(\sigma)$ with the large $|r|$ 
behavior of $g(r)$. For instance, $\rho$ may decay as a power law at large $\sigma$'s if the 
moments of $g$ are expected to be infinite above a given order. These conditions 
enable us to fix a number of parameters in $\rho$ such that the scaling function in 
Eq. (4) fits the data in the empirical collapse in Fig. 2. As discussed below, 
in our case the set of data on which we can count to construct histograms of 
$g$ is relatively poor. So, our determinations of $\rho$ will be rather qualitative. 
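The mixture representation of $g$ can be illustrated numerically. The sketch below is a minimal example under assumptions made here: it estimates $g(r)$ as the average of zero-mean Gaussian densities over widths $\sigma$ sampled from $\rho$. The log-normal choice for $\rho$ and the function names are hypothetical and serve only as a stand-in, since the text has not yet specified $\rho$.

```python
import math
import random


def gaussian_pdf(r, width):
    """Zero-mean Gaussian density with standard deviation `width`."""
    return math.exp(-r * r / (2.0 * width * width)) / math.sqrt(
        2.0 * math.pi * width * width
    )


def mixture_scaling_function(r, sigma_samples):
    """Monte Carlo estimate of g(r) as the rho-average of Gaussian densities."""
    return sum(gaussian_pdf(r, s) for s in sigma_samples) / len(sigma_samples)


# Hypothetical rho (assumption): log-normal widths, used only for illustration.
rng = random.Random(1)
sigma_samples = [rng.lognormvariate(0.0, 0.5) for _ in range(5000)]
g_at_zero = mixture_scaling_function(0.0, sigma_samples)
g_at_plus = mixture_scaling_function(1.5, sigma_samples)
g_at_minus = mixture_scaling_function(-1.5, sigma_samples)
```

By construction such a $g$ is even in $r$, and it reduces to a single Gaussian only when all sampled widths coincide.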

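Once $\rho$ has been identified, more ambitiously we may try to use it for a weighted 
representation of the joint PDF's of the successive elementary returns $R_i$, 
$i = 1, 2, \ldots$ generated in the process. Indeed, we may tentatively write the 
joint PDF of these returns in the following form: 

$$p_{R_1,\ldots,R_n}(r_1,\ldots,r_n) = \int_0^{\infty} d\sigma\, \rho(\sigma) \prod_{i=1}^{n} \frac{\exp\left(-r_i^2/(2\sigma^2 a_i^2)\right)}{\sqrt{2\pi\sigma^2 a_i^2}} \qquad (5)$$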

with $n = 1, 2, \ldots, 17$. The coefficients $a_i$ in the last equation have to be chosen 
consistent with the non-stationarity of the elementary returns reported in Fig. 
1a and with the other statistical properties of the elementary and aggregated 
returns discussed in the previous section. It is straightforward to realize that 
$\langle r_i^2\rangle_p = \langle\sigma^2\rangle_\rho\, a_i^2$, while $\langle r_i\rangle_p = 0$ and $\langle r_i r_j\rangle_p = 0$ for $i \neq j$, where $\langle\cdot\rangle_p$ denotes 
averages with respect to the joint PDF in Eq. (5), whereas $\langle\cdot\rangle_\rho$ those with 
respect to the PDF $\rho$. Likewise, we immediately realize that odd-odd or odd-even 
correlators of the $R_i$'s are strictly zero. Assuming validity of Eq. (5) 
means in the first place that the $i$-dependence of $a_i$ must be chosen so as to fit the 
values reported in Fig. 1a. The choice of the $i$-dependence of $a_i$ must also be 
consistent with the simple scaling of the PDF of aggregated returns. Indeed, 
taking into account that $R(t,t) = R_1 + R_2 + \cdots + R_t$, for $t = 1, 2, \ldots, 17$, 
Eq. (5) implies that for the same $t$ values 

$$p_{R(t,t)}(r) = \int_0^{\infty} d\sigma\, \rho(\sigma)\, \frac{\exp\left(-r^2/[2\sigma^2(a_1^2 + a_2^2 + \cdots + a_t^2)]\right)}{\sqrt{2\pi\sigma^2(a_1^2 + a_2^2 + \cdots + a_t^2)}} \qquad (6)$$

Comparing this result with Eq. (2), we see that it is necessary to choose the 
$a_i$'s such that $a_1^2 + a_2^2 + \cdots + a_t^2 = t^{2D}$ in order to be consistent with the 
empirical scaling in Eq. (2). This last requirement is satisfied if we put 

$$a_i = \sqrt{i^{2D} - (i-1)^{2D}}, \qquad i = 1, 2, \ldots \qquad (7)$$
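That the coefficients $a_i = \sqrt{i^{2D} - (i-1)^{2D}}$ satisfy the scaling requirement follows from a telescoping sum, which can be checked numerically. The short sketch below is an illustration only; the function name `a_coeff` and the value of `D` used are choices made here.

```python
import math


def a_coeff(i, D):
    """Coefficient a_i = sqrt(i**(2D) - (i-1)**(2D))."""
    return math.sqrt(i ** (2.0 * D) - (i - 1) ** (2.0 * D))


D = 0.364
coeffs = [a_coeff(i, D) for i in range(1, 18)]

# Partial sums of a_i**2 telescope to t**(2D) for every t.
partial_sums = []
running = 0.0
for a in coeffs:
    running += a * a
    partial_sums.append(running)
```

For $D < 1/2$ the coefficients decrease with $i$, which is the qualitative trend of the empirical second moment in Fig. 1a, while for $D = 1/2$ they are all equal to 1.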

A first problem is then to see whether this form of the $a_i$ coefficients is 
compatible with the $i$-dependence already implied by the non-stationarity. Eq. (7) 
appears to be reasonably well compatible with the trend of the empirical mean 
square elementary returns $m_2^e(i,1)$. Indeed, given $\langle\sigma^2\rangle_\rho = \langle r_1^2\rangle_p = 2.3 \cdot 10^{-7}$, 
the best fit in Fig. 1a is obtained with $D = 0.358\ldots$ in the expression for 
$\langle r_i^2\rangle_p$. The expectation value of $\sigma^2$ is with respect to the $\rho$ entering the 
integral representation (4) already chosen for $g$. Remarkably, the value of $D$ is 
very close to the estimate of $D$ obtained above through the analysis of the 
moments of $p^e_{R(t,t)}$. 

Summarizing, Eq. (5) and the above conditions on the $a_i$'s define a non-Markovian 
stochastic process with linearly uncorrelated increments and a 
PDF of returns satisfying a time-inhomogeneous scaling of the form: 

$$p_{R(t,T)}(r) = \frac{1}{\sqrt{t^{2D} - (t-T)^{2D}}}\; g\!\left(\frac{r}{\sqrt{t^{2D} - (t-T)^{2D}}}\right) \qquad (8)$$

where both $t$ and $T$ are understood to be integer multiples of the 10 minutes 
unit. In Fig. 2 it is shown that the data collapses of both $p^e_{R(t,t)}$ and $p^e_{R(t,1)}$ are 
indeed compatible with the same non-Gaussian PDF $g$. 
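The process summarized above lends itself to a direct simulation: each history draws a single $\sigma$ from $\rho$ and then generates independent Gaussian returns of widths $\sigma a_i$. The following sketch is a minimal illustration under assumptions made here, not the simulation scheme of Ref. [11]: the log-normal $\rho$, the ensemble size and all function names are hypothetical.

```python
import math
import random


def a_coeff(i, D):
    """Coefficient a_i = sqrt(i**(2D) - (i-1)**(2D))."""
    return math.sqrt(i ** (2.0 * D) - (i - 1) ** (2.0 * D))


def simulate_history(n, D, draw_sigma, rng):
    """One realization (r_1, ..., r_n): a single sigma is shared by all returns."""
    sigma = draw_sigma(rng)
    return [sigma * a_coeff(i, D) * rng.gauss(0.0, 1.0) for i in range(1, n + 1)]


def mean(values):
    return sum(values) / len(values)


# Hypothetical rho (assumption): log-normal sigma; D close to the fitted value.
D = 0.364
rng = random.Random(2)
histories = [
    simulate_history(17, D, lambda g: g.lognormvariate(0.0, 0.5), rng)
    for _ in range(20000)
]
first = [h[0] for h in histories]
last = [h[16] for h in histories]
m2_first = mean([x * x for x in first])
m2_last = mean([x * x for x in last])
# Linear correlation of non-overlapping returns (expected to be negligible).
linear_corr = mean([x * y for x, y in zip(first, last)]) / math.sqrt(m2_first * m2_last)
# Ratio correlator of absolute values (expected to exceed 1 when sigma fluctuates).
kappa_11 = mean([abs(x * y) for x, y in zip(first, last)]) / (
    mean([abs(x) for x in first]) * mean([abs(y) for y in last])
)
# Second moment of the total return R(17,17) = r_1 + ... + r_17.
m2_total = mean([sum(h) ** 2 for h in histories])
```

In such a synthetic ensemble the second moment of the elementary returns decays as $a_i^2$, the linear correlation is compatible with zero, and the absolute values remain correlated because all returns of a history share the same $\sigma$.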

From the point of view of probability theory, the structure of our process 
in Eq. (5) rests on a stability property for PDF's of sums of dependent random 
variables [11]. Indeed, if we indicate by $\tilde p^{\,n}(k_1, k_2, \ldots, k_n)$ the Fourier transform 
(characteristic function) of the joint PDF of the first $n$ returns ($1 \leq n \leq 17$), 
a direct calculation yields 

$$\tilde p^{\,n}(k, k, \ldots, k) = \tilde p^{\,1}(n^{D} k) \qquad (9)$$

and 

$$\tilde p^{\,n}(0, \ldots, k_i, \ldots, 0) = \tilde p^{\,1}(a_i k_i), \qquad i = 1, \ldots, n. \qquad (10)$$

For $D = 1/2$ these relations have the same form as those holding in the case 
of independent variables, when $\tilde p^{\,n}(k_1, \ldots, k_n) = \tilde p^{\,1}(k_1)\, \tilde p^{\,1}(k_2) \cdots \tilde p^{\,1}(k_n)$, and 
$\tilde p^{\,1}$ is a Gaussian characteristic function. However, even for $D = 1/2$ a general 
$\rho(\sigma)$ implies dependence of the $R_i$'s. To recover the independent case one needs 
further to choose $\rho(\sigma) = \delta(\sigma - \sigma_0)$. Thus, the superposition of independent 
Gaussian processes with different $\sigma$'s in Eq. (5) implies an extension of the 
basic stability properties of the independent Gaussian variables case to the 
dependent case. This extension also allows one to derive limit theorems for the 
anomalous scaling of sums of many dependent random variables [11]. 
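The "direct calculation" behind the two stability relations can be spelled out in a few lines. The derivation below is a sketch written here for convenience; it assumes the Gaussian-mixture form of the joint PDF with widths $\sigma a_i$ and the property $a_1^2 + \cdots + a_n^2 = n^{2D}$, and uses $\tilde p^{\,n}$ for the joint characteristic function of the first $n$ returns.

```latex
% Characteristic function of a mixture of independent Gaussians of widths \sigma a_i
\tilde p^{\,n}(k_1,\ldots,k_n)
  = \int_0^{\infty} d\sigma\, \rho(\sigma)
    \prod_{i=1}^{n} \exp\!\left(-\tfrac{1}{2}\,\sigma^2 a_i^2 k_i^2\right),
\qquad
\tilde p^{\,1}(k) = \int_0^{\infty} d\sigma\, \rho(\sigma)\,
    \exp\!\left(-\tfrac{1}{2}\,\sigma^2 k^2\right).

% All arguments equal: the exponents add up and the sum of a_i^2 telescopes
\tilde p^{\,n}(k,\ldots,k)
  = \int_0^{\infty} d\sigma\, \rho(\sigma)\,
    \exp\!\left(-\tfrac{1}{2}\,\sigma^2 k^2 \sum_{i=1}^{n} a_i^2\right)
  = \int_0^{\infty} d\sigma\, \rho(\sigma)\,
    \exp\!\left(-\tfrac{1}{2}\,\sigma^2 \left(n^{D} k\right)^2\right)
  = \tilde p^{\,1}(n^{D} k).

% Only k_i different from zero: a single Gaussian factor survives
\tilde p^{\,n}(0,\ldots,k_i,\ldots,0)
  = \int_0^{\infty} d\sigma\, \rho(\sigma)\,
    \exp\!\left(-\tfrac{1}{2}\,\sigma^2 a_i^2 k_i^2\right)
  = \tilde p^{\,1}(a_i k_i).
```

The factorized form of the independent case is recovered only when the integral over $\sigma$ collapses onto a single value, since otherwise the average of a product of Gaussian factors differs from the product of the averages.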






Fulvio Baldovin, Dario Bovina, Francesco Camana, and Attilio L. Stella 



4 Correlations structure 

As discussed above, the identification of $\rho$ may be used to reconstruct the joint 
PDF of the returns $R_i$'s as in Eq. (5). In this section we elaborate further on 
this point, by performing a detailed comparison between model predictions 
(based on an explicit expression for $\rho$) and empirical determinations of various 
two-point correlators. 

Considering the data collapse of both $p^e_{R(t,t)}$ and $p^e_{R(t,1)}$ in Fig. 2, we 
propose the following functional form for $\rho$ (see also [11]): 

$$\rho(\sigma) = A\,[\ldots], \qquad \sigma \in [\sigma_{\min}, +\infty[, \qquad \ldots < \gamma < \ldots \qquad (11)$$

where $A$ is a normalization factor, and a further positive parameter influences the 
width of the distribution $g$. Notice that $\rho(\sigma)$ behaves as a power of $\sigma$ for $\sigma \gg 1$. 
The rationale behind this choice for $\rho$ is that one can use the exponents $\gamma$, $\delta$ to reproduce 
the large $|x|$ behavior of $g(x)$, and then play with the other parameters to 
obtain a suitable fit of the scaling function, for instance the one reported in 
Fig. 2. 

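The first two-point correlator we consider in our analysis is 

$$\kappa_{\alpha,\beta}(1,n) \equiv \frac{\langle |R(1,1)|^\alpha\, |R(n,1)|^\beta \rangle}{\langle |R(1,1)|^\alpha\rangle\, \langle |R(n,1)|^\beta\rangle} = \frac{\langle |r_1|^\alpha\, |r_n|^\beta\rangle_p}{\langle |r_1|^\alpha\rangle_p\, \langle |r_n|^\beta\rangle_p} \qquad (12)$$

with $n > 1$, and $\alpha, \beta \in \mathbb{R}_+$. A value $\kappa_{\alpha,\beta} \neq 1$ means that returns on 
non-overlapping intervals are dependent. Using Eq. (5) it is possible to express a 
general many-return correlator in terms of the moments of $\rho$. For example, 
from Eq. (5) we have 

$$\langle |r_1|^\alpha\, |r_n|^\beta \rangle_p = B_\alpha B_\beta\, a_1^\alpha\, a_n^\beta\, \langle \sigma^{\alpha+\beta}\rangle_\rho \qquad (13)$$

with 

$$B_\alpha \equiv \int_{-\infty}^{+\infty} dr\, |r|^\alpha\, \frac{\exp(-r^2/2)}{\sqrt{2\pi}} \qquad (14)$$

We thus obtain 

$$\kappa_{\alpha,\beta}(1,n) = \frac{\langle\sigma^{\alpha+\beta}\rangle_\rho}{\langle\sigma^\alpha\rangle_\rho\, \langle\sigma^\beta\rangle_\rho} = \frac{B_\alpha B_\beta\, \langle |r_1|^{\alpha+\beta}\rangle_p}{B_{\alpha+\beta}\, \langle |r_1|^\alpha\rangle_p\, \langle |r_1|^\beta\rangle_p} \qquad (15)$$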

Two model-predictions in Eq. (15) are: (i) Despite the non-stationarity of 
the increments $R_i$'s, $\kappa_{\alpha,\beta}(1,n)$ is independent of $n$; (ii) The correlators are 
symmetric, i.e., $\kappa_{\alpha,\beta} - \kappa_{\beta,\alpha} = 0$. 
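Both predictions can be made concrete with a small numerical sketch. It relies on two facts that follow from the Gaussian-mixture structure of the model: $B_\alpha$ is the $\alpha$-th absolute moment of a standard Gaussian variable, which has a closed form in terms of the Gamma function, and the correlator reduces to the ratio $\langle\sigma^{\alpha+\beta}\rangle_\rho / (\langle\sigma^\alpha\rangle_\rho \langle\sigma^\beta\rangle_\rho)$, in which neither $n$ nor the $a_i$'s appear. The log-normal samples standing in for $\rho$ and the function names are assumptions made here for illustration.

```python
import math
import random


def gaussian_abs_moment(alpha):
    """B_alpha = E|z|**alpha for a standard Gaussian z (closed form via Gamma)."""
    return 2.0 ** (alpha / 2.0) * math.gamma((alpha + 1.0) / 2.0) / math.sqrt(math.pi)


def kappa_model(alpha, beta, sigma_samples):
    """Ratio <sigma^(alpha+beta)> / (<sigma^alpha> <sigma^beta>) over samples of rho."""
    m = len(sigma_samples)
    joint = sum(s ** (alpha + beta) for s in sigma_samples) / m
    single_a = sum(s ** alpha for s in sigma_samples) / m
    single_b = sum(s ** beta for s in sigma_samples) / m
    return joint / (single_a * single_b)


# Hypothetical rho (assumption): log-normal widths, used only for illustration.
rng = random.Random(3)
sigma_samples = [rng.lognormvariate(0.0, 0.5) for _ in range(5000)]
kappa_12 = kappa_model(1.0, 2.0, sigma_samples)
kappa_21 = kappa_model(2.0, 1.0, sigma_samples)
# A rho concentrated on a single sigma gives independent returns and kappa = 1.
kappa_delta = kappa_model(1.0, 2.0, [0.7] * 10)
```

The symmetry under exchange of $\alpha$ and $\beta$ is manifest in this form, and any spread in $\sigma$ pushes the ratio above 1.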

We can now compare the theoretical prediction of the model for $\kappa_{\alpha,\beta}(1,n)$, 
Eq. (15), with the empirical counterpart 

$$\kappa^e_{\alpha,\beta}(1,n) \equiv \frac{\frac{1}{M}\sum_{l=1}^{M} |r_1^l|^\alpha\, |r_n^l|^\beta}{\left(\frac{1}{M}\sum_{l=1}^{M} |r_1^l|^\alpha\right) \left(\frac{1}{M}\sum_{l=1}^{M} |r_n^l|^\beta\right)} \qquad (16)$$

which we can calculate from the EUR/USD dataset. Notice that once $\rho$ is fixed 
to fit the one-time statistics in Fig. 2, in this comparison we do not have any 
free parameter to adjust. Also, since our ensemble is restricted to $M$ = 1,282 
realizations only, large fluctuations, especially in two-time statistics, are to be 
expected. 

Fig. 3. Constancy of $\kappa^e_{\alpha,\beta}$. Dashed lines are model-predictions. 

Fig. 3 shows that indeed non-overlapping returns are strongly correlated in 
the roughly three hours following the opening of the trading session, since $\kappa^e_{\alpha,\beta} \neq 1$. 
In addition, the constancy of $\kappa^e_{\alpha,\beta}$ is clearly suggested by the empirical data. 
In view of this constancy, we can assume as error-bars for $\kappa^e_{\alpha,\beta}$ the standard 
[...] are also in agreement with the theoretical predictions for $\kappa_{\alpha,\beta}$ based on our 
choice for $\rho$. In this and in the following comparisons it should be kept in 
mind that, although not explicitly reported in the plots, the uncertainty in the 
identification of $\rho$ of course introduces an uncertainty in the model-predictions 
for the correlators. 

Fig. 4. Symmetry of $\kappa^e_{\alpha,\beta}$. Error-bars are determined as in Fig. 3. 

In Fig. 4 we report that also the symmetry $\kappa_{\alpha,\beta} = \kappa_{\beta,\alpha}$ is empirically 
verified for the EUR/USD dataset. The validity of this symmetry for a process 
with non-stationary increments like the present one is quite remarkable. 

A classical indicator of strong correlations in financial data is the volatility 
autocorrelation, defined as 

$$c(1,n) \equiv \frac{\langle |r_1|\, |r_n|\rangle_p - \langle |r_1|\rangle_p\, \langle |r_n|\rangle_p}{\langle |r_1|^2\rangle_p - \langle |r_1|\rangle_p^2} \qquad (17)$$

Fig. 5. Volatility autocorrelation. Dashed line is the model-prediction. 

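In terms of the moments of $\rho$, through Eq. (13) we have the following expression 
for $c$: 

$$c(1,n) = \frac{B_1^2\, a_1 a_n \left(\langle\sigma^2\rangle_\rho - \langle\sigma\rangle_\rho^2\right)}{a_1^2 \left(B_2\, \langle\sigma^2\rangle_\rho - B_1^2\, \langle\sigma\rangle_\rho^2\right)} \qquad (18)$$

Unlike $\kappa_{\alpha,\beta}$, $c$ is not constant in $n$. The comparison with the empirical volatility 
autocorrelation, 

$$c^e(1,n) \equiv \frac{\frac{1}{M}\sum_{l=1}^{M} |r_1^l|\, |r_n^l| - \left(\frac{1}{M}\sum_{l=1}^{M} |r_1^l|\right) \left(\frac{1}{M}\sum_{l=1}^{M} |r_n^l|\right)}{\frac{1}{M}\sum_{l=1}^{M} |r_1^l|^2 - \left(\frac{1}{M}\sum_{l=1}^{M} |r_1^l|\right)^2} \qquad (19)$$

yields a substantial agreement (see Fig. 5). The error-bars in Fig. 5 are 
obtained by dynamically generating many ensembles of $M$ = 1,282 realizations 
each, according to Eq. (5) with our choice for $\rho$, and taking the standard 
deviations of the results. Again, the uncertainty associated with the theoretical 
prediction for $c$ is not reported in the plots. Problems concerning the numerical 
simulation of processes like the one in Eq. (5) are discussed in Ref. [11]. 

Fig. 6. Correlators $K^e_{\alpha,\beta}$. Dashed lines are model-predictions. 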
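The $n$-dependence of the model volatility autocorrelation can be sketched in a few lines. The fragment below assumes that $c(1,n)$ is the covariance of $|r_1|$ and $|r_n|$ normalized by the variance of $|r_1|$, and that $\langle |r_1|\,|r_n| \rangle_p = B_1^2\, a_1 a_n\, \langle\sigma^2\rangle_\rho$ with $B_1^2 = 2/\pi$, $B_2 = 1$ and $a_1 = 1$; under these assumptions $c(1,n)$ is proportional to $a_n$. The function names and the numerical moments of $\rho$ are hypothetical.

```python
import math


def a_coeff(i, D):
    """Coefficient a_i = sqrt(i**(2D) - (i-1)**(2D))."""
    return math.sqrt(i ** (2.0 * D) - (i - 1) ** (2.0 * D))


def volatility_autocorrelation(n, D, mean_sigma, mean_sigma_sq):
    """Model c(1,n) from the first two moments of rho (with a_1 = 1, B_2 = 1)."""
    b1_sq = 2.0 / math.pi  # B_1**2: squared mean absolute value of a standard Gaussian
    numerator = b1_sq * a_coeff(n, D) * (mean_sigma_sq - mean_sigma ** 2)
    denominator = mean_sigma_sq - b1_sq * mean_sigma ** 2
    return numerator / denominator


# Hypothetical moments of rho (assumption), chosen only to show the trend in n.
D = 0.364
c_2 = volatility_autocorrelation(2, D, 1.0, 1.3)
c_5 = volatility_autocorrelation(5, D, 1.0, 1.3)
c_no_spread = volatility_autocorrelation(3, D, 1.0, 1.0)
```

Since $a_n$ decreases with $n$ for $D < 1/2$, the autocorrelation decays along the trading window, and it vanishes identically when $\sigma$ does not fluctuate.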

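A further test of our model can be made by analyzing, in place of those 
of the increments, the non-linear correlators of $R(t,t)$, with varying $t$. To this 
purpose, let us define 

$$K_{\alpha,\beta}(t_1,t_2) \equiv \frac{\langle |R(t_1,t_1)|^\alpha\, |R(t_2,t_2)|^\beta \rangle}{\langle |R(t_1,t_1)|^\alpha\rangle\, \langle |R(t_2,t_2)|^\beta\rangle} \qquad (20)$$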
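with $t_2 > t_1$. Model calculations similar to the previous ones give, from Eq. 
(5), 

$$K_{\alpha,\beta}(t_1,t_2) = \frac{F_{\alpha,\beta}(t_1,t_2)}{B_\alpha B_\beta\, t_1^{\alpha D}\, t_2^{\beta D}}\; \frac{\langle\sigma^{\alpha+\beta}\rangle_\rho}{\langle\sigma^\alpha\rangle_\rho\, \langle\sigma^\beta\rangle_\rho} \qquad (21)$$

where 

$$F_{\alpha,\beta}(t_1,t_2) \equiv \int dr_1\, |r_1|^\alpha\, \frac{\exp\left(-r_1^2/(2 t_1^{2D})\right)}{\sqrt{2\pi t_1^{2D}}} \int dr_2\, |r_2|^\beta\, \frac{\exp\left[-(r_1 - r_2)^2/(2 t_2^{2D} - 2 t_1^{2D})\right]}{\sqrt{2\pi (t_2^{2D} - t_1^{2D})}} \qquad (22)$$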

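According to Eq. (21), $K_{\alpha,\beta}$ is now identified by both $\rho$ and $D$. Moreover, 
it explicitly depends on $t_1$ and $t_2$. The comparison between Eq. (21) and the 
empirical quantity 

$$K^e_{\alpha,\beta}(t_1,t_2) \equiv \frac{\frac{1}{M}\sum_{l=1}^{M} |r^l(t_1,t_1)|^\alpha\, |r^l(t_2,t_2)|^\beta}{\left(\frac{1}{M}\sum_{l=1}^{M} |r^l(t_1,t_1)|^\alpha\right) \left(\frac{1}{M}\sum_{l=1}^{M} |r^l(t_2,t_2)|^\beta\right)} \qquad (23)$$

reported in Fig. 6 (the error-bars are determined as in Fig. 5) thus supplies 
an additional validation of our model. 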
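The empirical correlator of aggregated returns is a ratio of ensemble averages and is straightforward to compute once each history is turned into its cumulative sums. The sketch below is an illustration with hypothetical function names and toy data; it assumes each history is stored as the list of its elementary 10-minute returns.

```python
def mean(values):
    return sum(values) / len(values)


def aggregated_returns(history):
    """Cumulative sums r(t,t) = r_1 + ... + r_t for one history."""
    totals = []
    running = 0.0
    for r in history:
        running += r
        totals.append(running)
    return totals


def ratio_correlator(xs, ys, alpha, beta):
    """<|x|^alpha |y|^beta> / (<|x|^alpha> <|y|^beta>) over paired samples."""
    joint = mean([abs(x) ** alpha * abs(y) ** beta for x, y in zip(xs, ys)])
    return joint / (mean([abs(x) ** alpha for x in xs]) * mean([abs(y) ** beta for y in ys]))


def empirical_K(histories, t1, t2, alpha, beta):
    """Ensemble estimate of the correlator between r(t1,t1) and r(t2,t2)."""
    totals = [aggregated_returns(h) for h in histories]
    xs = [tot[t1 - 1] for tot in totals]
    ys = [tot[t2 - 1] for tot in totals]
    return ratio_correlator(xs, ys, alpha, beta)


# Toy ensemble of three histories (made-up numbers, for illustration only).
toy = [[1.0, -2.0, 0.5], [-0.5, 0.5, 1.0], [2.0, 1.0, -1.0]]
k_13 = empirical_K(toy, 1, 3, 1.0, 2.0)
```

The same `ratio_correlator` helper applies unchanged to pairs of elementary returns, which is how the correlator between non-overlapping increments would be estimated.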



5 Conclusions 

In the present work we addressed the problem of describing the time evolution 
of financial assets in a case in which one can try to compare the predictions 
of the proposed model with a relatively rich ensemble of history realizations. 
Besides the fact that treating the histories available for the EUR/USD 
exchange rate as a proper ensemble amounts to a main working assumption, 
a clear limitation of such an approach is the relative poorness of the ensemble 
itself. Indeed, the simulations of our model suggest that in order to reduce 
substantially the statistical fluctuations one should have ensembles larger 
by at least one order of magnitude. 

In spite of these limitations, we believe that the non-Markovian model we 
propose [5, 6, 11] is validated to a reasonable extent by the analysis of the 
data, especially those pertaining to the various correlators we considered. In 
this respect it is important to recall that the first proposal of the 
time-inhomogeneous evolution model discussed here was made in a study of a 
single, long time series of the DJI index in Ref. [5]. In that context, the time 
inhomogeneity of the returns, Eq. (8), was supposed to underlie the stationarized 
information provided by the empirical PDF of the returns. This assumption 
made it possible there to justify several stylized facts, like the scaling 
and multiscaling of the empirical return PDF and the power law behavior in 
time of the return autocorrelation function. We believe that the results 
obtained in the present report, even if pertaining to a different time-scale (tens 
of minutes in place of days), constitute an interesting further argument in 
favor of a general validity of the model. 

The peculiar feature of this model is that of focussing on scaling and 
correlations as basic, closely connected properties of assets evolution. This was 
strongly inspired by what has been learnt in the physics of complex systems 
in the last decades [17, 18, 19], where methods like the renormalization group 
allowed for the first time systematic treatments of these properties [6]. At the 
same time, through the original probabilistic parallel mentioned in Section 3, 
our model maintains an interesting direct contact with the mathematics of 
standard formulations based on Brownian motion, of wide use in Finance. 
This last feature is very interesting in the perspective of applying our model 
to problems of derivative pricing [13, 14, 15, 16, 20]. 